## # A tibble: 19 x 2
##      Q29     n
##    <dbl> <int>
##  1    13   915
##  2    NA   771
##  3     5   745
##  4     4   636
##  5     8   351
##  6     3   180
##  7    11   173
##  8    16   151
##  9    10   128
## 10     7   119
## 11    17   114
## 12     1    84
## 13    12    71
## 14    18    69
## 15     6    55
## 16     2    21
## 17    14    13
## 18     9     6
## 19    15     3
## # A tibble: 5 x 2
##     Q32     n
##   <dbl> <int>
## 1     1  1038
## 2     2  1122
## 3     3  1121
## 4     4   702
## 5    NA   622
## # A tibble: 5 x 2
##     Q32     n
##   <dbl> <int>
## 1     1    72
## 2     2    81
## 3     3    80
## 4     4    75
## 5    NA   463
## # A tibble: 13 x 2
##      Q33     n
##    <dbl> <int>
##  1     1  1004
##  2     2    63
##  3     3   143
##  4     4   904
##  5     5    60
##  6     6    68
##  7     7    66
##  8     8   252
##  9     9   419
## 10    10   420
## 11    11   224
## 12    12   388
## 13    NA   594
## # A tibble: 13 x 2
##      Q33     n
##    <dbl> <int>
##  1     1   947
##  2     4   835
##  3    10   399
##  4     9   382
##  5    12   351
##  6     8   237
##  7    11   199
##  8    NA   133
##  9     3   119
## 10     6    62
## 11     5    57
## 12     7    57
## 13     2    56
## # A tibble: 6 x 2
##     Q37     n
##   <dbl> <int>
## 1     1  2985
## 2     2  1016
## 3     3    29
## 4     4    44
## 5  1234     1
## 6    NA   530

Drop NAs for specific questions and filter out disciplines with fewer than 30 students in the sample (the cutoff)

## # A tibble: 18 x 2
##      Q29     n
##    <dbl> <int>
##  1    13   644
##  2     5   540
##  3     4   443
##  4     8   243
##  5    16   120
##  6     3   111
##  7    11   111
##  8     7    95
##  9    10    95
## 10    17    81
## 11     1    70
## 12    12    57
## 13    18    42
## 14     6    35
## 15     2    16
## 16    14     8
## 17     9     3
## 18    15     1
## # A tibble: 14 x 2
##      Q29     n
##    <dbl> <int>
##  1    13   644
##  2     5   540
##  3     4   443
##  4     8   243
##  5    16   120
##  6     3   111
##  7    11   111
##  8     7    95
##  9    10    95
## 10    17    81
## 11     1    70
## 12    12    57
## 13    18    42
## 14     6    35
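
The two filtering steps above can be sketched as follows. This is a minimal illustration on toy data, assuming dplyr is available; the data frame `survey`, the helper column `n_major`, and the choice of Q37 as a second required question are hypothetical, and only the Q29 column name and the cutoff of 30 come from the notes above.

```r
library(dplyr)

# Toy stand-in for the survey data; Q29 is the major code.
survey <- data.frame(
  Q29 = c(rep(13, 40), rep(5, 35), rep(9, 3), rep(NA, 5)),
  Q37 = c(rep(1, 80), rep(NA, 3))
)

min_n <- 30  # cutoff stated in the notes

survey_kept <- survey %>%
  filter(!is.na(Q29), !is.na(Q37)) %>%   # drop NAs on the required questions
  add_count(Q29, name = "n_major") %>%   # attach each major's sample size
  filter(n_major >= min_n) %>%           # keep majors at or above the cutoff
  select(-n_major)

count(survey_kept, Q29, sort = TRUE)
```

Counting after the NA filter matters: a major can fall below the cutoff once incomplete responses are removed.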

Major counts and percentages

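| Q29 | major | n | pct_total | cumulat_pct |
|---|---|---|---|---|
| 13 | Mec | 644 | 23.97 | 23.97 |
| 5 | Che | 540 | 20.10 | 44.06 |
| 4 | Civ | 443 | 16.49 | 60.55 |
| 8 | Ele | 243 | 9.04 | 69.59 |
| 16 | Softw | 120 | 4.47 | 74.06 |
| 3 | Bio | 111 | 4.13 | 78.19 |
| 11 | Ind | 111 | 4.13 | 82.32 |
| 7 | Comp | 95 | 3.54 | 85.86 |
| 10 | Env/Eco | 95 | 3.54 | 89.39 |
| 17 | Str/Arc | 81 | 3.01 | 92.41 |
| 1 | Aer/Oce | 70 | 2.61 | 95.01 |
| 12 | Mat | 57 | 2.12 | 97.13 |
| 18 | Gen | 42 | 1.56 | 98.70 |
| 6 | Con | 35 | 1.30 | 100.00 |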
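
The percentage columns follow directly from the counts. The sketch below recomputes them in base R from the counts in the major table above; the data frame name `majors` is an assumption, and the original code may have used a different approach.

```r
# Counts copied from the major table above.
majors <- data.frame(
  major = c("Mec", "Che", "Civ", "Ele", "Softw", "Bio", "Ind", "Comp",
            "Env/Eco", "Str/Arc", "Aer/Oce", "Mat", "Gen", "Con"),
  n = c(644, 540, 443, 243, 120, 111, 111, 95, 95, 81, 70, 57, 42, 35)
)

total <- sum(majors$n)  # number of respondents across the retained majors

# Accumulate the counts before rounding so rounding error does not build up.
majors$pct_total   <- round(100 * majors$n / total, 2)
majors$cumulat_pct <- round(100 * cumsum(majors$n) / total, 2)

majors
```

Summing the already rounded percentages instead would give 44.07 in the second row rather than the 44.06 shown in the table.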

Gender counts overall

| Q37 | n | pct_total |
|---|---|---|
| 1 | 1973 | 73.43 |
| 2 | 678 | 25.23 |
| 3 | 15 | 0.56 |
| 4 | 21 | 0.78 |

Fill in 0s for NAs on specific items (Q1, Q3, Q5)
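
A minimal base R sketch of this step on toy data. The column names `Q1a` and `Q3b` and the name pattern used to select them are hypothetical; the real item names are not shown here.

```r
# Toy responses; Q1a and Q3b stand in for the Q1/Q3/Q5 items.
responses <- data.frame(
  Q1a = c(1, NA, 1),
  Q3b = c(NA, NA, 1),
  Q29 = c(13, NA, 5)   # not a fill target, so its NA must survive
)

# Select only columns named Q1*, Q3*, or Q5* followed by a letter suffix.
fill_cols <- grep("^Q(1|3|5)[a-z]", names(responses), value = TRUE)

# Replace NA with 0 in those columns only.
responses[fill_cols] <- lapply(responses[fill_cols], function(x) {
  x[is.na(x)] <- 0
  x
})

responses
```

Restricting the fill to named columns keeps a missing major (Q29) from silently becoming a code of 0.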

Drop majors with low counts (below 30 students in sample)

## # A tibble: 14 x 2
##      Q29     n
##    <dbl> <int>
##  1    13   618
##  2     5   523
##  3     4   433
##  4     8   240
##  5    16   118
##  6     3   110
##  7    11   108
##  8     7    93
##  9    10    90
## 10    17    78
## 11     1    70
## 12    12    55
## 13    18    40
## 14     6    32
## # A tibble: 14 x 2
##    major       n
##    <chr>   <int>
##  1 Mec       618
##  2 Che       523
##  3 Civ       433
##  4 Ele       240
##  5 Softw     118
##  6 Bio       110
##  7 Ind       108
##  8 Comp       93
##  9 Env/Eco    90
## 10 Str/Arc    78
## 11 Aer/Oce    70
## 12 Mat        55
## 13 Gen        40
## 14 Con        32

Drop rows with NA as major

## # A tibble: 14 x 2
##    major       n
##    <chr>   <int>
##  1 Mec       618
##  2 Che       523
##  3 Civ       433
##  4 Ele       240
##  5 Softw     118
##  6 Bio       110
##  7 Ind       108
##  8 Comp       93
##  9 Env/Eco    90
## 10 Str/Arc    78
## 11 Aer/Oce    70
## 12 Mat        55
## 13 Gen        40
## 14 Con        32

Clustering Process (two-step process using UMAP + HDBSCAN)

First perform dimension reduction using UMAP

## NULL
##           [,1]      [,2]
## [1,]  15.62332 -15.97910
## [2,]  11.12283 -18.44035
## [3,] -34.20802  18.18004
## [1]  15.623323  11.122833 -34.208024 -34.419584   6.551877 -34.582958
## [1] -15.97910 -18.44035  18.18004  18.60377 -23.96864  18.41222
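
The output above has the shape of the `layout` matrix returned by the `umap` package, so the step may have looked roughly like the sketch below. This is an assumption: the input matrix `items`, its dimensions, and the seed are all hypothetical, and the original settings are not shown.

```r
library(umap)

set.seed(42)
# Toy numeric item matrix: 200 respondents x 9 items.
items <- matrix(as.numeric(sample(0:5, 200 * 9, replace = TRUE)), nrow = 200)

config <- umap.defaults
config$n_components <- 2    # two embedding dimensions, as in the output above
config$random_state <- 42   # fix the seed so the layout is reproducible

embedding <- umap(items, config = config)

head(embedding$layout, 3)   # coordinates, one row per respondent
```

Fixing `random_state` is worth doing here because the cluster labels downstream depend on the exact layout.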

Next, perform clustering with HDBSCAN

## HDBSCAN clustering for 2608 objects.
## Parameters: minPts = 120
## The clustering contains 6 cluster(s) and 241 noise points.
## 
##    0    1    2    3    4    5    6 
##  241 1080  575  175  147  264  126 
## 
## Available fields: cluster, minPts, cluster_scores, membership_prob,
##                   outlier_scores, hc
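
The printed summary matches the format of `hdbscan()` from the `dbscan` package. A minimal sketch on toy two-dimensional data, assuming that package; the object names and the smaller `minPts` are illustrative only (the run above used `minPts = 120` on 2608 points).

```r
library(dbscan)

set.seed(1)
# Toy 2-D embedding with two well-separated groups (stand-in for the UMAP layout).
emb <- rbind(
  matrix(rnorm(200, mean = -10), ncol = 2),
  matrix(rnorm(200, mean =  10), ncol = 2)
)

# minPts sets the smallest group size that HDBSCAN will treat as a cluster.
cl <- hdbscan(emb, minPts = 20)

table(cl$cluster)   # label 0 collects the noise points

# Keep the labels next to the coordinates so they can be joined back later.
embedding_df <- data.frame(
  umap_1  = emb[, 1],
  umap_2  = emb[, 2],
  cluster = cl$cluster
)
```

Because label 0 means "noise" rather than a seventh group, it is usually worth deciding up front whether to keep or drop those rows in later comparisons.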

Join the dataframes back together

## # A tibble: 7 x 2
##   cluster     n
##     <dbl> <int>
## 1       1  1080
## 2       2   575
## 3       5   264
## 4       0   241
## 5       3   175
## 6       4   147
## 7       6   126

Views of when climate change will affect different groups broken down by cluster assignments

Looking for patterns in the clusters

Understanding cluster compositions more

First create a dataframe with the rankings for each cluster, where a lower ranking means students in the cluster think climate change will affect more categories sooner.

## # A tibble: 7 x 3
##   cluster_time_rank cluster cluster_avg
##               <int>   <dbl>       <dbl>
## 1                 1       1       0.386
## 2                 2       4       3.46 
## 3                 3       0       4.76 
## 4                 4       5       5.14 
## 5                 5       6       7.13 
## 6                 6       2       9.24 
## 7                 7       3      17.4
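
How `cluster_avg` was computed is not shown. The sketch below assumes a per-student numeric score (the hypothetical `time_score`, where lower means sooner and broader impact) that is averaged within each cluster and then ranked in ascending order.

```r
# Toy data: one row per student with a cluster label and a hypothetical score.
students <- data.frame(
  cluster    = c(0, 0, 1, 1, 2, 2),
  time_score = c(4, 6, 0, 1, 9, 10)
)

# Average the score within each cluster.
cluster_ranks <- aggregate(time_score ~ cluster, data = students, FUN = mean)
names(cluster_ranks)[2] <- "cluster_avg"

# Order by the average and number the rows: rank 1 = soonest.
cluster_ranks <- cluster_ranks[order(cluster_ranks$cluster_avg), ]
cluster_ranks$cluster_time_rank <- seq_len(nrow(cluster_ranks))

cluster_ranks
```

Relabeling clusters by rank gives the arbitrary HDBSCAN labels a meaningful order for plots and tables.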

Set clustering colors for all plots

Look at the distribution, by cluster, of when students think each community may be affected by global warming

Same information but faceted with clusters

This is a good plot for seeing that each cluster's beliefs about when global warming will affect different populations vary in a clear pattern.

Overall count for clusters

Distribution of college experiences by climate change time impact cluster (items from Q6, Q7, Q8, Q9)

Attempt to combine all tests together

Table of Kruskal Wallis test results

| coll_exp_item | coll_exp_item_name | statistic | p.value | parameter | method |
|---|---|---|---|---|---|
| Q6a | Extracurr: Eng. research | 10.2672493 | 0.0361587 | 4 | Kruskal-Wallis rank sum test |
| Q6d | Extracurr: Work/volunteer in dev. cou. | 11.4765919 | 0.0216990 | 4 | Kruskal-Wallis rank sum test |
| Q6e | Extracurr: Work for eng. co. | 12.7560859 | 0.0125312 | 4 | Kruskal-Wallis rank sum test |
| Q6i | Extracurr: Travel w/ int. ser. gr. | 1.1376456 | 0.8882542 | 4 | Kruskal-Wallis rank sum test |
| Q6j | Extracurr: Part of env. sus. gr. | 17.7670286 | 0.0013704 | 4 | Kruskal-Wallis rank sum test |
| Q7a_tally | Course top: Energy supply | 3.3153187 | 0.5065088 | 4 | Kruskal-Wallis rank sum test |
| Q7b_tally | Course top: Energy demand | 1.9897052 | 0.7376525 | 4 | Kruskal-Wallis rank sum test |
| Q7c_tally | Course top: Climate change | 8.9512570 | 0.0623294 | 4 | Kruskal-Wallis rank sum test |
| Q7d_tally | Course top: Terrorism + war | 1.5727449 | 0.6655854 | 3 | Kruskal-Wallis rank sum test |
| Q7e_tally | Course top: Water supply | 8.4595963 | 0.0761214 | 4 | Kruskal-Wallis rank sum test |
| Q7f_tally | Course top: Pop. growth | 2.2057888 | 0.6979696 | 4 | Kruskal-Wallis rank sum test |
| Q7g_tally | Course top: Food avail. | 4.9021092 | 0.1791071 | 3 | Kruskal-Wallis rank sum test |
| Q7h_tally | Course top: Disease | 1.5455360 | 0.8185459 | 4 | Kruskal-Wallis rank sum test |
| Q7i_tally | Course top: Poverty | 7.1030371 | 0.0686851 | 3 | Kruskal-Wallis rank sum test |
| Q7j_tally | Course top: Sus. dev. | 16.9109312 | 0.0020115 | 4 | Kruskal-Wallis rank sum test |
| Q7k_tally | Course top: LCA | 4.4471837 | 0.3488562 | 4 | Kruskal-Wallis rank sum test |
| Q7l_tally | Course top: Bio-mimicry | 3.0037680 | 0.3910446 | 3 | Kruskal-Wallis rank sum test |
| Q7m_tally | Course top: Env deg. | 13.9254268 | 0.0075369 | 4 | Kruskal-Wallis rank sum test |
| Q7n_tally | Course top: Culturally app. sol. | 7.2501404 | 0.1232453 | 4 | Kruskal-Wallis rank sum test |
| Q7o_tally | Course top: Opp. for future gen. | 12.2974620 | 0.0152711 | 4 | Kruskal-Wallis rank sum test |
| Q7p_tally | Course top: Female pioneers | 5.5485642 | 0.2354935 | 4 | Kruskal-Wallis rank sum test |
| Q7q_tally | Course top: Under-rep of fem. | 8.9579497 | 0.0621592 | 4 | Kruskal-Wallis rank sum test |
| Q7r_tally | Course top: Under-rep of min. | 7.1316553 | 0.1290923 | 4 | Kruskal-Wallis rank sum test |
| Q7s_tally | Course top: Eng. careers/stages | 10.6811766 | 0.0303908 | 4 | Kruskal-Wallis rank sum test |
| Q7t_tally | Course top: Benefits of being eng. | 9.7037805 | 0.0457243 | 4 | Kruskal-Wallis rank sum test |
| Q7u_tally | Course top: Student stories | 4.3729155 | 0.3578835 | 4 | Kruskal-Wallis rank sum test |
| Q7v_tally | Course top: Teacher stories | 4.6219896 | 0.3283267 | 4 | Kruskal-Wallis rank sum test |
| Q8j | Design course: Concepts to cont. issues | 7.3542516 | 0.1183106 | 4 | Kruskal-Wallis rank sum test |
| Q8n | Design course: Concepts to help ppl | 5.5861029 | 0.2322640 | 4 | Kruskal-Wallis rank sum test |
| Q9a | Sus. minor | 3.4440431 | 0.0634802 | 1 | Kruskal-Wallis rank sum test |
| Q9b | Design for ppl in need | 0.2529607 | 0.6149980 | 1 | Kruskal-Wallis rank sum test |
| Q9c | Design w/ int. ser. | 3.1641257 | 0.0752727 | 1 | Kruskal-Wallis rank sum test |
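
The Kruskal-Wallis `parameter` is the number of groups minus one, so the value of 1 for the binary Q9 items suggests that the item response, not the cluster, served as the grouping variable. The base R sketch below follows that reading on toy data; it is an inference, not the original code, and the column `cluster_time_rank` and the simulated items are assumptions.

```r
set.seed(7)
n <- 300
# Toy data: a cluster rank per student plus two hypothetical items,
# one with five response levels and one binary.
d <- data.frame(
  cluster_time_rank = sample(1:7, n, replace = TRUE),
  Q6a = sample(0:4, n, replace = TRUE),
  Q9a = sample(0:1, n, replace = TRUE)
)

items <- c("Q6a", "Q9a")

# One test per item, collected with the same fields as the table above.
kw_results <- do.call(rbind, lapply(items, function(item) {
  # Assumption: the item response defines the groups being compared.
  res <- kruskal.test(x = d$cluster_time_rank, g = factor(d[[item]]))
  data.frame(
    coll_exp_item = item,
    statistic     = unname(res$statistic),
    p.value       = res$p.value,
    parameter     = unname(res$parameter),
    method        = res$method
  )
}))

kw_results
```

With 32 tests in the table, some adjustment for multiple comparisons (for example `p.adjust()` on the `p.value` column) may be worth considering before interpreting individual items.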

Redo the Q9 analysis, since these are binary items

Question 9 items (binary outcomes)

Q9a - Did you minor in or have a concentration related to sustainability?

## # A tibble: 3 x 2
##   Q9a_bin     n
##   <chr>   <int>
## 1 No       2278
## 2 Yes       286
## 3 <NA>       44
##    
##      No Yes
##   1 936 128
##   2 122  22
##   3 208  31
##   4 232  26
##   5 113  12
##   6 508  54
##   7 159  13
## 
##  Pearson's Chi-squared test
## 
## data:  cont_table
## X-squared = 8.2958, df = 6, p-value = 0.2172
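
The test above can be reproduced from the printed cross-tabulation alone. The sketch below rebuilds the table by hand in base R; the name `cont_table` is taken from the output, while the dimension labels are assumptions.

```r
# Counts copied from the Q9a cross-tabulation above (rows = groups 1 to 7).
cont_table <- matrix(
  c(936, 128,
    122,  22,
    208,  31,
    232,  26,
    113,  12,
    508,  54,
    159,  13),
  ncol = 2, byrow = TRUE,
  dimnames = list(group = as.character(1:7), Q9a_bin = c("No", "Yes"))
)

# Yates' continuity correction only applies to 2 x 2 tables, so none is used here.
res <- chisq.test(cont_table)

res$parameter            # degrees of freedom: (7 - 1) * (2 - 1) = 6
round(res$expected, 1)   # expected counts under independence
```

Checking `res$expected` is a quick way to confirm that no cell is small enough to make the chi-squared approximation doubtful.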

Q9b - Did your most recent in-major engineering design project contribute to helping people in need?

## # A tibble: 3 x 2
##   Q9b_bin     n
##   <chr>   <int>
## 1 No       1788
## 2 Yes       773
## 3 <NA>       47
##    
##      No Yes
##   1 729 335
##   2 113  31
##   3 167  72
##   4 180  76
##   5  92  31
##   6 382 180
##   7 125  48
## 
##  Pearson's Chi-squared test
## 
## data:  cont_table
## X-squared = 8.8485, df = 6, p-value = 0.1823

Q9c - Did your most recent in-major engineering design course include an international service component?

## # A tibble: 3 x 2
##   Q9c_bin     n
##   <chr>   <int>
## 1 No       2316
## 2 Yes       241
## 3 <NA>       51
##    
##      No Yes
##   1 964  98
##   2 134   9
##   3 218  21
##   4 231  25
##   5 117   6
##   6 499  62
##   7 153  20
## 
##  Pearson's Chi-squared test
## 
## data:  cont_table
## X-squared = 7.4818, df = 6, p-value = 0.2786

End cluster analysis